Stein Estimation for Spherically
Symmetric Distributions: 
Recent Developments 

Ann Cohen Brandwein and William E. Strawderman 



Abstract. This paper reviews advances in Stein-type shrinkage
estimation for spherically symmetric distributions. Some emphasis is placed
on developing intuition as to why shrinkage should work in location 
problems whether the underlying population is normal or not. Consid- 
erable attention is devoted to generalizing the "Stein lemma" which 
underlies much of the theoretical development of improved minimax 
estimation for spherically symmetric distributions. A main focus is 
on distributional robustness results in cases where a residual vector 
is available to estimate an unknown scale parameter, and, in particu- 
lar, in finding estimators which are simultaneously generalized Bayes 
and minimax over large classes of spherically symmetric distributions. 
Some attention is also given to the problem of estimating a location 
vector restricted to lie in a polyhedral cone. 

Key words and phrases: Stein estimation, spherical symmetry, mini- 
maxity, admissibility. 



1. INTRODUCTION 

We are happy to help celebrate Stein's stunning, 
deep and significant contribution to the statistical 
literature. In 1956, Charles Stein (1956) proved a re- 
sult that astonished many and was the catalyst for 
an enormous and rich literature of substantial im- 
portance in statistical theory and practice. Stein 
showed that when estimating, under squared error
loss, the unknown mean vector $\theta$ of a p-dimensional
random vector X having a normal distribution with
identity covariance matrix, estimators of the form
$(1 - a/\{\|X\|^2 + b\})X$ dominate the usual estimator
of $\theta$, X, for a sufficiently small and b sufficiently large
when $p \ge 3$. James and Stein (1961) sharpened the
result and gave an explicit class of dominating estimators,
$(1 - a/\|X\|^2)X$ for $0 < a < 2(p-2)$, and also
showed that the choice of $a = p - 2$ (the James-Stein
estimator) is uniformly best. For future reference recall
that "the usual estimator," X, is a minimax
estimator for the normal model, and more generally
for any distribution with finite covariance matrix.

Stein (1974, 1981), considering general estimators
of the form $\delta(X) = X + g(X)$, gave an expression for
the risk of these estimators based on a key lemma,
which has come to be known as Stein's lemma. Numerous
results on shrinkage estimation in the general
spherically symmetric case followed based on
some generalization of Stein's lemma to handle the
cross product term $E_\theta[(X-\theta)'g(X)]$ in the expression
for the risk of the estimator.

A substantial number of papers for the multivariate
normal and nonnormal distributions have been
written over the decades following Stein's monumen- 
tal results. For an earlier expository development of 
Stein estimation for nonnormal location models see 
Brandwein and Strawderman (1990). 

This paper covers the development of Stein esti- 
mation for spherically symmetric distributions since 
Brandwein and Strawderman (1990). It is not ency- 
clopedic, but touches on only some of the significant 
results for the nonnormal case. 

Given an observation, X, on a p-dimensional spherically
symmetric multivariate distribution with unknown
mean, $\theta$, and whose density is $f(\|x-\theta\|^2)$
(for $x, \theta \in R^p$), we will consider the problem of estimating
$\theta$ subject to the squared error loss function,
that is, $\delta(X)$ is a measurable (vector-valued) function,
and the loss is given by

(1.1)  $$L(\theta,\delta) = \|\delta-\theta\|^2,$$

where $\delta = (\delta_1,\delta_2,\ldots,\delta_p)'$ and $\theta = (\theta_1,\theta_2,\ldots,\theta_p)'$.
The risk function of $\delta$ is defined as

$$R(\theta,\delta) = E_\theta[L(\theta,\delta(X))].$$

Unless otherwise specified, we will be using the loss 
defined by (1.1). Other loss functions such as the 
loss $L(\theta,\delta) = \|\delta-\theta\|^2/\sigma^2$ will be occasionally used,
especially when there is also an unknown scale pa- 
rameter, and minimaxity, as opposed to domination, 
is the main object of study. We will have relatively 
little to say about the important case of confidence 
set loss, or of loss estimation. 

In Section 2 we provide some additional intuition 
as to why the Stein estimator of the mean vector $\theta$
makes sense as an approximation to an optimal lin- 
ear estimator and as an empirical Bayes estimator in 
a general location problem. The discussion indicates 
that normality need play no role in the intuitive de- 
velopment of Stein-type shrinkage estimators. 

Section 3 is devoted to finding improved estimators
of $\theta$ for spherically symmetric distributions with
a known scale parameter using results of Brandwein
and Strawderman (1991) and Berger (1975) to bound
the risk of the improved general estimator
$\delta(X) = X + \sigma^2 g(X)$.

Section 4 considers estimating the mean vector 
for a general spherically symmetric distribution in 
the presence of an unknown scale parameter, and, 
more particularly, when a residual vector is available
to estimate the scale parameter. It extends some of
the results from Section 3 to this case as well as 
presenting new improved estimators for this prob- 
lem. The results in this section indicate a remark- 
able robustness property of Stein-type estimators in 
this setting, namely, that certain of the improved 
estimators dominate X uniformly for all spherically 
symmetric distributions simultaneously (subject to 
risk finiteness). 

In Section 5 we consider the restricted parameter
space problem, particularly the case where $\theta$
is restricted to a polyhedral cone, or more generally
a smooth cone. The material in this section is
adapted from Fourdrinier, Strawderman and Wells 
(2003). 

In Section 6 we consider some of the advancements 
in Bayes estimation of location vectors for both the 
known and unknown scale cases. We present an intriguing
result of Maruyama (2003b)
which is related to the (distributional) robustness of 
Stein estimators in the unknown scale case treated 
in Section 4. 

Section 7 contains some concluding remarks. 

2. SOME FURTHER INTUITION INTO STEIN 
ESTIMATION 

We begin by adding some intuition as to why 
Stein estimation is both reasonable and compelling, 
and refer the reader to Brandwein and Strawderman 
(1990) for some earlier developments. The reader is 
also referred to Stigler (1990) and to Meng (2005). 

2.1 Stein Estimators as an Approximation to the 
Best Linear Estimator 

The following is a very simple intuitive development
for optimal linear estimation of the mean vector
in $R^p$ that leads to the Stein estimator.

Suppose $E_\theta[X] = \theta$, $\mathrm{Cov}(X) = \sigma^2 I$ ($\sigma^2$ known),
and consider the linear estimator of the form
$\delta_a(X) = (1-a)X$. What is the optimal value of a? The risk
is given by

$$R(\theta,\delta_a) = p(1-a)^2\sigma^2 + a^2\|\theta\|^2,$$

and the derivative, with respect to a, is

$$\{d/da\}R(\theta,\delta_a) = 2\{-p(1-a)\sigma^2 + a\|\theta\|^2\}.$$

Hence, the optimal a is $p\sigma^2/(p\sigma^2 + \|\theta\|^2)$ and the optimal
"estimator" is $\delta(X) = (1 - p\sigma^2/\{p\sigma^2 + \|\theta\|^2\})X$,
which is, of course, not an estimator because it depends
on $\theta$.

However, $E_\theta[\|X\|^2] = p\sigma^2 + \|\theta\|^2$, so $1/\|X\|^2$ is
a reasonable estimator of $1/\{p\sigma^2 + \|\theta\|^2\}$. Hence, an
approximation to the optimal linear "estimator" is
$\delta(X) = (1 - p\sigma^2/\|X\|^2)X$, which is the James-Stein
estimator except that $p$ replaces $p - 2$. Note that
as p gets larger, $\|X\|^2/p$ is likely to improve as an
estimator of $\sigma^2 + \|\theta\|^2/p$ and, hence, we may expect
that the dimension, p, plays a role.
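
The comparison in this subsection can be illustrated by simulation. The following is only a rough sketch under assumptions that are not part of the argument above: normal errors, $p = 10$, $\sigma^2 = 1$, and a particular $\theta$. The function name `shrinkage_risks` is hypothetical. It estimates the risks of X, of the optimal linear "estimator" (which uses the unknown $\theta$), of the plug-in rule with constant $p$, and of the James-Stein rule with constant $p - 2$.

```python
import numpy as np

def shrinkage_risks(theta, sigma2=1.0, n_rep=20000, seed=0):
    """Monte Carlo risks of X, the oracle linear rule, and two plug-in rules."""
    rng = np.random.default_rng(seed)
    p = theta.size
    # Normal errors are an assumption made only for this illustration.
    x = theta + np.sqrt(sigma2) * rng.standard_normal((n_rep, p))
    sq_norm = np.sum(x**2, axis=1, keepdims=True)

    # Optimal a from the text; it depends on the unknown theta.
    a_opt = p * sigma2 / (p * sigma2 + theta @ theta)
    estimators = {
        "X": x,
        "oracle": (1.0 - a_opt) * x,
        "plug_in_p": (1.0 - p * sigma2 / sq_norm) * x,
        "james_stein": (1.0 - (p - 2) * sigma2 / sq_norm) * x,
    }
    return {name: float(np.mean(np.sum((est - theta) ** 2, axis=1)))
            for name, est in estimators.items()}

risks = shrinkage_risks(theta=np.full(10, 0.5))
for name, value in risks.items():
    print(f"{name:12s} {value:.3f}")
```

In this configuration the oracle rule should have the smallest estimated risk and X the largest, with the two plug-in rules in between.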

2.2 Stein Estimators as Empirical Bayes 
Estimators for General Location Models 

Strawderman (1992) considered the following general
location model. Suppose $X|\theta \sim f(x-\theta)$, where
$E_\theta[X] = \theta$, $\mathrm{Cov}(X) = \sigma^2 I$ ($\sigma^2$ known) but that $f(\cdot)$
is otherwise unspecified. Also assume that the prior
distribution for $\theta$ is given by $f^{*n}(\theta)$, the n-fold
convolution of $f(\cdot)$ with itself. Hence, the prior
distribution of $\theta$ can be represented as the distribution
of a sum of n i.i.d. variables $u_i$, $i = 1,\ldots,n$, where
each $u_i$ is distributed as $f(u)$. Also, $u_0 = (X - \theta)$
has the same distribution and is
independent of the other u's.

The Bayes estimator can therefore be thought of as

$$\delta(X) = E[\theta|X] = E[\theta \mid X - \theta + \theta]
  = E\Big[\sum_{i=1}^n u_i \,\Big|\, \sum_{i=0}^n u_i\Big]$$

and, hence,

$$\delta(X) = nE\Big[u_i \,\Big|\, \sum_{i=0}^n u_i\Big]
  = \frac{n}{n+1}E\Big[\sum_{i=0}^n u_i \,\Big|\, \sum_{i=0}^n u_i\Big]
  = \frac{n}{n+1}E[X|X] = \frac{n}{n+1}X$$

or, equivalently, $\delta(X) = E[\theta|X] = (1 - 1/\{n+1\})X$.
Assuming that n is unknown, we may estimate
it from the marginal distribution of X, which has
the same distribution as $X - \theta + \theta = \sum_{i=0}^n u_i$. In
particular,

$$E[\|X\|^2] = E\Big[\Big\|\sum_{i=0}^n u_i\Big\|^2\Big]
  = \sum_{i=0}^n E[\|u_i\|^2] = (n+1)p\sigma^2,$$

since $E[u_i] = 0$ and $\mathrm{Cov}(u_i) = \sigma^2 I$, so that $E[\|u_i\|^2] = p\sigma^2$.
Therefore, $(n+1)$ can be estimated by $(p\sigma^2)^{-1}\|X\|^2$.
Substituting this estimator of $(n+1)$ in the expression
for the Bayes estimator, we have an empirical
Bayes estimator

$$\delta(X) = (1 - p\sigma^2/\|X\|^2)X,$$

which is again the James-Stein estimator, save for
the substitution of $p$ for $p - 2$.

Note that in both of the above developments,
the only assumptions were that $E_\theta[X] = \theta$, and
$\mathrm{Cov}(X) = \sigma^2 I$. The Stein-type estimator thus appears
intuitively, at least, to be a reasonable estimator
in a general location problem.
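
Since normality plays no role in the empirical Bayes argument, it can be tried on a non-normal $f$. The sketch below is illustrative only and rests on assumptions not made in the text: $f$ is taken to be a product of centered Laplace densities with variance $\sigma^2$, and $p = 50$, $n = 3$. The helper name `empirical_bayes_demo` is hypothetical. It draws $\theta$ as a sum of n i.i.d. copies, forms $X = \theta + u_0$, and compares X, the Bayes rule $(1 - 1/\{n+1\})X$, and the empirical Bayes rule.

```python
import numpy as np

def empirical_bayes_demo(p=50, n=3, sigma2=1.0, n_rep=5000, seed=1):
    """Bayes risks when theta is a sum of n i.i.d. Laplace vectors (an assumed f)."""
    rng = np.random.default_rng(seed)
    scale = np.sqrt(sigma2 / 2.0)            # Laplace(b) has variance 2 b^2
    u = rng.laplace(0.0, scale, size=(n_rep, n + 1, p))   # u_0, u_1, ..., u_n
    theta = u[:, 1:, :].sum(axis=1)          # prior: n-fold convolution of f
    x = theta + u[:, 0, :]                   # X = theta + u_0
    sq_norm = np.sum(x**2, axis=1, keepdims=True)

    n_plus_1_hat = sq_norm / (p * sigma2)    # estimate of n + 1 from ||X||^2
    bayes = (1.0 - 1.0 / (n + 1)) * x
    emp_bayes = (1.0 - 1.0 / n_plus_1_hat) * x

    def avg_loss(est):
        return float(np.mean(np.sum((est - theta) ** 2, axis=1)))

    return {
        "X": avg_loss(x),
        "bayes": avg_loss(bayes),
        "empirical_bayes": avg_loss(emp_bayes),
        "mean_n_plus_1_hat": float(np.mean(n_plus_1_hat)),
    }

eb = empirical_bayes_demo()
print(eb)
```

With these settings the Bayes risk of X is $p\sigma^2 = 50$ and that of the Bayes rule is $p\sigma^2 n/(n+1) = 37.5$; the empirical Bayes rule is expected to land close to the latter.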

3. SOME RECENT DEVELOPMENTS FOR THE CASE
OF A KNOWN SCALE PARAMETER

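Let $X \sim f(\|x-\theta\|^2)$, the loss be $L(\theta,\delta) = \|\delta-\theta\|^2$,
so the risk is $R(\theta,\delta) = E_\theta[\|\delta(X)-\theta\|^2]$. Suppose an
estimator has the general form $\delta(X) = X + \sigma^2 g(X)$.
Then

$$\begin{aligned}
R(\theta,\delta) &= E_\theta[\|\delta(X)-\theta\|^2] \\
 &= E_\theta[\|X + \sigma^2 g(X) - \theta\|^2] \\
 &= E_\theta[\|X-\theta\|^2] + \sigma^4 E_\theta[\|g(X)\|^2]
    + 2\sigma^2 E_\theta[(X-\theta)'g(X)].
\end{aligned}$$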

In the normal case, Stein's lemma, given loosely as
follows, is used to evaluate the last term.

Lemma 3.1 [Stein (1981)]. If $X \sim N(\theta,\sigma^2 I)$,
then $E_\theta[(X-\theta)'g(X)] = \sigma^2 E_\theta[\nabla'g(X)]$ [where
$\nabla'g(\cdot) = \sum_{i=1}^p \partial g_i/\partial x_i$ denotes the divergence of $g(\cdot)$],
provided, say, that g is continuously differentiable and that
all expected values exist.

Proof. The proof is particularly easy in one di- 
mension, and is a simple integration by parts. In 
higher dimensions the proof may just add the one- 
dimensional components or may be a bit more so- 
phisticated and cover more general functions, g. In 
the most general version known to us, the proof uses 
Stokes' theorem and requires $g(\cdot)$ to be weakly
differentiable. □

Using the Stein lemma, we immediately have the 
following result. 

Proposition 3.1. If $X \sim N(\theta,\sigma^2 I)$, then

$$R(\theta, X + \sigma^2 g(X)) = E_\theta[\|X-\theta\|^2]
  + \sigma^4 E_\theta[\|g(X)\|^2 + 2\nabla'g(X)]$$

and, hence, provided the expectations are finite, a sufficient
condition for $\delta(X)$ to dominate X is
$\|g(x)\|^2 + 2\nabla'g(x) \le 0$ a.e. (with strict inequality on a set of
positive measure).
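
Lemma 3.1 and Proposition 3.1 can be checked numerically for the James-Stein choice $g(x) = -(p-2)x/\|x\|^2$, whose divergence is $-(p-2)^2/\|x\|^2$. The sketch below assumes $\sigma^2 = 1$, $p = 6$ and an arbitrary $\theta$. The helper names `js_g` and `divergence_fd` are hypothetical, and the divergence is approximated by central finite differences rather than computed in closed form.

```python
import numpy as np

def js_g(x):
    """James-Stein choice g(x) = -(p - 2) x / ||x||^2, applied row-wise."""
    p = x.shape[1]
    return -(p - 2) * x / np.sum(x**2, axis=1, keepdims=True)

def divergence_fd(g, x, eps=1e-5):
    """Central finite-difference divergence of g at each row of x."""
    div = np.zeros(x.shape[0])
    for i in range(x.shape[1]):
        step = np.zeros(x.shape[1])
        step[i] = eps
        div += (g(x + step)[:, i] - g(x - step)[:, i]) / (2.0 * eps)
    return div

rng = np.random.default_rng(2)
p = 6
theta = np.linspace(-1.0, 1.0, p)
x = theta + rng.standard_normal((50000, p))     # X ~ N(theta, I), sigma^2 = 1

g = js_g(x)
div_g = divergence_fd(js_g, x)

# Stein's lemma: E[(X - theta)' g(X)] should match E[div g(X)].
cross_term = float(np.mean(np.sum((x - theta) * g, axis=1)))
stein_term = float(np.mean(div_g))

# Proposition 3.1: risk via ||X - theta||^2 + ||g||^2 + 2 div g versus direct risk.
risk_formula = float(np.mean(np.sum((x - theta) ** 2, axis=1)
                             + np.sum(g**2, axis=1) + 2.0 * div_g))
risk_direct = float(np.mean(np.sum((x + g - theta) ** 2, axis=1)))
print(cross_term, stein_term, risk_formula, risk_direct)
```

Both pairs of numbers should agree up to Monte Carlo error, and the estimated risk should fall below $p$, the risk of X.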

The key to most of the literature on shrinkage estimation
in the general spherically symmetric case
is to find some generalization of (or substitution for)
Stein's lemma to evaluate (or bound) the cross product
term $E_\theta[(X-\theta)'g(X)]$. We indicate two useful
techniques below.

3.1 Generalizations of James-Stein Estimators 
Under Spherical Symmetry 

Brandwein and Strawderman (1991) extended the 
results of Stein (1974, 1981) to spherically symmetric
distributions for estimators of the form $X + ag(X)$.
The following two preliminary lemmas are necessary 
to prove the result in Theorem 3.1. 

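Lemma 3.2. Let X have a distribution that is
spherically symmetric about $\theta$. Then

$$E_\theta[(X-\theta)'g(X) \mid \|X-\theta\|^2 = R^2]
  = p^{-1}R^2\,\mathrm{Ave}_{B(R,\theta)}\nabla'g(X),$$

provided $g(x)$ is weakly differentiable.

Proof. Notation for this lemma: $S(R,\theta)$
and $B(R,\theta)$ are, respectively, the (surface of the)
sphere and (solid) ball, of radius R centered at $\theta$.
Note also that $(X-\theta)/R$ is the unit outward normal
vector at X on $S(R,\theta)$. Also $d\sigma(X)$ is the area
measure on $S(R,\theta)$, while $A(\cdot)$ and $V(\cdot)$ denote area
and volume, respectively. Since the conditional distribution
of $X-\theta$ given $\|X-\theta\|^2 = R^2$ is uniform
on the sphere of radius R, it follows that

$$\begin{aligned}
E_\theta[(X-\theta)'g(X) \mid \|X-\theta\|^2 = R^2]
 &= \mathrm{Ave}_{S(R,\theta)}\{(X-\theta)'g(X)\} \\
 &= \frac{R}{A(S(R,\theta))}\int_{S(R,\theta)}
    \Big(\frac{X-\theta}{R}\Big)'g(X)\,d\sigma(X) \\
 &= \frac{R}{A(S(R,\theta))}\int_{B(R,\theta)}\nabla'g(x)\,dx
    \quad\text{(by Stokes' theorem)} \\
 &= \frac{R^2}{pV(B(R,\theta))}\int_{B(R,\theta)}\nabla'g(x)\,dx
    \quad\Big(\text{since } \frac{V(B(R,\theta))}{A(S(R,\theta))} = R/p\Big) \\
 &= p^{-1}R^2\,\mathrm{Ave}_{B(R,\theta)}\nabla'g(X). \qquad □
\end{aligned}$$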

The following result is basic to the study of su- 
perharmonic functions and is well known (see, e.g., 
du Plessis, 1970, page 54). 



Lemma 3.3. Let $h(x)$ be superharmonic on $S(R)$
[i.e., $\sum_{i=1}^p \{\partial^2/\partial x_i^2\}h(x) \le 0$], then
$\mathrm{Ave}_{S(R,\theta)}h(x) \le \mathrm{Ave}_{B(R,\theta)}h(x)$.

Consider, now, an estimator of the general form
$X + ag(X)$, where a is a scalar, and $g(X)$ maps
$R^p \to R^p$.

Theorem 3.1. Let X have a distribution that
is spherically symmetric about $\theta$. Assume the following:

1. $\|g(x)\|^2/2 \le -h(x) \le -\nabla'g(x)$,

2. $-h(x)$ is superharmonic, $E_\theta[R^2h(W)]$ is nonincreasing
in R for each $\theta$, where W has a uniform
distribution on $B(R,\theta)$,

3. $0 \le a \le 1/\{pE_0[1/\|X\|^2]\}$.

Then $X + ag(X)$ is minimax with respect to quadratic
loss, provided $g(\cdot)$ is weakly differentiable and all
expectations are finite.

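Proof.

$$\begin{aligned}
R(\theta, X + ag(X)) - R(\theta, X)
 &= E[E_\theta[a^2\|g(X)\|^2 + 2a(X-\theta)'g(X) \mid \|X-\theta\|^2 = R^2]] \\
 &\le E[E_\theta[-2a^2h(X) + 2a(X-\theta)'g(X) \mid \|X-\theta\|^2 = R^2]] \\
 &= E[E_\theta[-2a^2h(X) \mid \|X-\theta\|^2 = R^2]
    + 2aE[\{R^2/p\}\mathrm{Ave}_{B(R,\theta)}\nabla'g(X) \mid R^2]] \\
 &\le E[E_\theta[-2a^2h(X) \mid \|X-\theta\|^2 = R^2]
    + 2aE_\theta[\{R^2/p\}E_\theta h(W) \mid R^2]] \\
 &\le E[E_\theta[-2a^2h(W) \mid R^2]
    + 2aE_\theta[\{R^2/p\}E_\theta h(W) \mid R^2]]
    \quad\text{(by Lemma 3.3)} \\
 &= 2aE[E_\theta[R^2h(W) \mid R^2](-a/R^2 + 1/p)] \\
 &\le 2aE[E_\theta[R^2h(W) \mid R^2]]\,E[-a/R^2 + 1/p] \\
 &\le 0
\end{aligned}$$

by the covariance inequality since $E_\theta[R^2h(W) \mid R^2]$
is nonincreasing and $-R^{-2}$ is increasing and since
$h \le 0$. □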

Example 3.1. James-Stein estimators [$g(x) =
-2(p-2)x/\|x\|^2$]: In this case both $\|g(x)\|^2/2$ and
$-\nabla'g(x)$ are equal to $2(p-2)^2/\|x\|^2$. Conditions 1
and 2 of Theorem 3.1 are satisfied for $h(x) = -2(p-2)^2/\|x\|^2$,
provided $p \ge 4$, since $\|x\|^{-2}$ is superharmonic
if $p \ge 4$, and since $E_\theta[R^2/\|X\|^2] = E_{\theta/R}[1/\|X\|^2]$
is increasing by Anderson's theorem.

Hence, by condition 3, for any spherically symmetric
distribution, the James-Stein estimator
$(1 - a2(p-2)/\|X\|^2)X$ is minimax for $0 \le a \le 1/\{pE_0[1/\|X\|^2]\}$
and $p \ge 4$. The domination over X is strict for
$0 < a < 1/\{pE_0[1/\|X\|^2]\}$, and also for $a = 1/\{pE_0[1/\|X\|^2]\}$,
provided the distribution is not normal.

Baranchik (1970), for the normal case, considered
estimators of the form $(1 - ar(\|X\|^2)/\|X\|^2)X$ under
certain conditions on $r(\cdot)$. Under the assumption
that $r(\cdot)$ is monotone nondecreasing, bounded between
0 and 1, and concave, Theorem 3.1 applies to
these estimators as well, and establishes minimaxity
for $0 \le a \le 1/\{pE_0[1/\|X\|^2]\}$ and for $p \ge 4$.

We note in passing that the results in this subsection
hold for an arbitrary spherically symmetric distribution
with or without a density. The calculations
rely only on the distribution of X conditional on
$\|X-\theta\|^2 = R^2$, and, of course, finiteness of $E[\|X\|^2]$
and $E[\|g(X)\|^2]$.
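
As a rough illustration of Example 3.1 for a distribution without a density, the sketch below takes the errors to be uniform on a sphere of radius R, so that $E_0[1/\|X\|^2] = 1/R^2$ and the bound in condition 3 equals $R^2/p$. All specifics are assumptions chosen for the illustration: $p = 8$, $R^2 = 8$, $\theta = (0.8,\ldots,0.8)'$, and $a$ set to half the bound. The helper names `uniform_on_sphere` and `theorem_bound` are hypothetical.

```python
import numpy as np

def uniform_on_sphere(rng, n, p, radius):
    """n draws, uniform on the sphere of the given radius in R^p."""
    z = rng.standard_normal((n, p))
    return radius * z / np.linalg.norm(z, axis=1, keepdims=True)

def theorem_bound(centered_draws):
    """Monte Carlo estimate of 1 / (p * E_0[1/||X||^2]) from draws with theta = 0."""
    p = centered_draws.shape[1]
    return 1.0 / (p * np.mean(1.0 / np.sum(centered_draws**2, axis=1)))

rng = np.random.default_rng(3)
p, radius = 8, np.sqrt(8.0)
errors = uniform_on_sphere(rng, 40000, p, radius)

a_max = theorem_bound(errors)          # equals R^2 / p = 1 for this distribution
a = 0.5 * a_max                        # a value inside the minimax range

theta = np.full(p, 0.8)
x = theta + errors
# James-Stein form from Example 3.1: (1 - 2 a (p - 2) / ||X||^2) X
js = (1.0 - 2.0 * a * (p - 2) / np.sum(x**2, axis=1, keepdims=True)) * x

risk_x = float(np.mean(np.sum((x - theta) ** 2, axis=1)))
risk_js = float(np.mean(np.sum((js - theta) ** 2, axis=1)))
print(a_max, risk_x, risk_js)
```

Here the risk of X is exactly $R^2$, and the shrinkage estimator is expected to have a visibly smaller estimated risk.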

3.2 A Useful Expression for the Risk of 
a James-Stein Estimator 

Berger (1975) gave a useful expression for the risk 
of a James-Stein estimator which is easily gener- 
alized to the case of a general estimator, provided 
the spherically symmetric distribution has a density
$f(\|x-\theta\|^2)$.

Some form of this generalization (and extensions 
to unknown scale case and the elliptically symmet- 
ric case) has been used by several authors, including 
Fourdrinier, Strawderman and Wells (2003), Four- 
drinier, Kortbi and Strawderman (2008), Fourdrinier 
and Strawderman (2008), Maruyama (2003a) and 
Kubokawa and Srivastava (2001), among others. 

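Lemma 3.4. Suppose $X \sim f(\|x-\theta\|^2)$, and let
$F(t) = 2^{-1}\int_t^\infty f(u)\,du$ and $Q(t) = F(t)/f(t)$. Then

$$R(\theta, X + g(X)) = E_\theta[\|X-\theta\|^2]
  + E_\theta[\|g(X)\|^2 + 2Q(\|X-\theta\|^2)\nabla'g(X)].$$

Proof. The lemma follows immediately with the
following identity for the cross product term:

$$\begin{aligned}
E_\theta[(X-\theta)'g(X)]
 &= \int_{R^p}(x-\theta)'g(x)f(\|x-\theta\|^2)\,dx \\
 &= -\int_{R^p}g(x)'\nabla F(\|x-\theta\|^2)\,dx \\
 &= \int_{R^p}\nabla'g(x)F(\|x-\theta\|^2)\,dx
    \quad\text{(by Green's theorem)} \\
 &= E_\theta[Q(\|X-\theta\|^2)\nabla'g(X)]. \qquad □
\end{aligned}$$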

Berger (1975), Maruyama (2003a) and Fourdrinier, 
Kortbi and Strawderman (2008) used the above re- 
sult for distributions for which Q{t) is bounded be- 
low by a positive constant. In this case, the next 
result follows immediately from Lemma 3.4. 

Theorem 3.2. Suppose $X \sim f(\|x-\theta\|^2)$, and that
$Q(t) \ge c > 0$. Then the estimator $X + g(X)$ dominates
X provided $\|g(x)\|^2 + 2c\nabla'g(x) \le 0$ for all x.

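Example 3.2. As noted by Berger (1975), if $f(\cdot)$
is a scale mixture of normals, then $Q(t)$ is bounded
below. To see this, note that if $X|V \sim N(\theta, VI)$
and $V \sim g(v)$, then
$f(t) = \int_0^\infty (2\pi v)^{-p/2}\exp(-t/2v)g(v)\,dv$. Similarly,

$$\begin{aligned}
F(t) &= 2^{-1}\int_t^\infty f(u)\,du \\
 &= 2^{-1}\int_0^\infty (2\pi v)^{-p/2}g(v)\int_t^\infty \exp(-u/2v)\,du\,dv \\
 &= \int_0^\infty (2\pi v)^{-p/2}\,v\exp(-t/2v)g(v)\,dv.
\end{aligned}$$

Hence,

$$Q(t) = \frac{\int_0^\infty v^{(2-p)/2}\exp(-t/2v)g(v)\,dv}
              {\int_0^\infty v^{-p/2}\exp(-t/2v)g(v)\,dv}
 = E_t[V] \ge E_0[V]
 = \frac{\int_0^\infty v^{1-p/2}g(v)\,dv}{\int_0^\infty v^{-p/2}g(v)\,dv}
 = \frac{E[V^{1-p/2}]}{E[V^{-p/2}]} > 0,$$

where $E_t$ denotes expectation with respect to the
density proportional to $v^{-p/2}\exp(-t/2v)g(v)$. The
inequality follows since the family has monotone
likelihood ratio in t.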

Hence, for the James-Stein class $(1 - a/\|X\|^2)X$,
this result gives dominance over X for

$$a^2 - 2a(p-2)\frac{E[V^{1-p/2}]}{E[V^{-p/2}]} \le 0$$

or

$$0 \le a \le 2(p-2)\frac{E[V^{1-p/2}]}{E[V^{-p/2}]}.$$

This bound on the shrinkage constant, a, compares
poorly with that obtained by Strawderman (1974),
$0 \le a \le 2(p-2)/E[V^{-1}]$, which may be obtained
by using Stein's lemma conditional on V and the
fact that $E_\theta[V/\|X\|^2 \mid V]$ is monotone nondecreasing
in V. Note that, again by monotone likelihood ratio
properties (or the covariance inequality),

$$\frac{E[V^{1-p/2}]}{E[V^{-p/2}]} \le \frac{1}{E[V^{-1}]}.$$
It is therefore somewhat surprising that Maruyama 
(2003a) and Fourdrinier, Kortbi and Strawderman 
(2008) were able to use Theorem 3.2, applied to 
Baranchik-type estimators, to obtain generalized and 
proper Bayes minimax estimators. Without going 
into details, the advantage of the cruder bound is 
that it requires only that r(t) be monotone, while 
Strawderman's result for mixtures of normal distri- 
butions also requires that r{t)/t be monotone de- 
creasing. 
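
To see how far apart the two bounds on the shrinkage constant in Example 3.2 can be, they can be evaluated in closed form for a specific mixing distribution. The sketch below assumes, purely for illustration, that $1/V$ has a Gamma distribution with shape $\alpha$ and rate $\beta$ (so V is inverse gamma), for which $E[V^{-s}] = \Gamma(\alpha+s)/\{\Gamma(\alpha)\beta^s\}$. The function names are hypothetical.

```python
import math

def inv_gamma_moment(s, alpha, beta):
    """E[V^{-s}] when 1/V ~ Gamma(shape=alpha, rate=beta); needs alpha + s > 0."""
    return math.exp(math.lgamma(alpha + s) - math.lgamma(alpha) - s * math.log(beta))

def shrinkage_bounds(p, alpha, beta):
    """Upper bounds on a: from Theorem 3.2 (via Q >= c) and from Strawderman (1974)."""
    c = inv_gamma_moment(p / 2 - 1, alpha, beta) / inv_gamma_moment(p / 2, alpha, beta)
    via_theorem_3_2 = 2 * (p - 2) * c
    via_strawderman = 2 * (p - 2) / inv_gamma_moment(1, alpha, beta)
    return via_theorem_3_2, via_strawderman

# alpha = beta = 2.5 is an assumed example (a t-type mixture with 5 degrees of freedom).
bounds = {p: shrinkage_bounds(p, alpha=2.5, beta=2.5) for p in (3, 5, 10, 20)}
for p, (b1, b2) in bounds.items():
    print(p, round(b1, 4), round(b2, 4))
```

Under this assumption the first bound simplifies to $2(p-2)\beta/(\alpha + p/2 - 1)$ and the second to $2(p-2)\beta/\alpha$, so the gap widens as p grows.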

Other applications of Lemma 3.4 give refined 
bounds on the shrinkage constant in the James-Stein
or Baranchik estimator depending on monotonicity
properties of $Q(t)$. Typically, additional conditions
are required on the function $r(t)$ as well. See,
for example, Brandwein, Ralescu and Strawderman 
(1993) (although the calculations in that paper are 
somewhat different than those in this section, the 
basic idea is quite similar). 

Applications of the risk expression in Lemma 3.4 
are complicated relative to those in the normal case 
using Stein's lemma, in that the mean vector, $\theta$, remains
to complicate matters through the function
$Q(\|X-\theta\|^2)$. It is both surprising and interesting
that matters become essentially simpler (in a cer- 
tain sense) when the scale parameter is unknown, 
but a residual vector is available. We investigate this 
phenomenon in the next section. 

4. STEIN ESTIMATION IN THE UNKNOWN 
SCALE CASE 

In this section we study the model $(X,U) \sim f(\|x-\theta\|^2 + \|u\|^2)$,
where $\dim X = \dim\theta = p$, and $\dim U = k$.
The classical example of this model is, of course,
the normal model
$f(t) = (1/\{\sqrt{2\pi}\sigma\})^{p+k}e^{-t/(2\sigma^2)}$. However,
a variety of other models have proven useful.
Perhaps the most important alternatives to the normal
model in practice and in theory are the generalized
multivariate-t distributions

$$f(t) = \frac{c}{\sigma^{p+k}}\Big(\frac{1}{a + t/\sigma^2}\Big)^{b}$$

or, more generally, scale mixture of normals of the
form

$$f(t) = \int_0^\infty \Big(\frac{1}{\sqrt{2\pi}\sigma}\Big)^{p+k}
         e^{-t/(2\sigma^2)}\,dG(\sigma^2).$$



These models preserve the spherical symmetry 
about the mean vector and, hence, the covariance 
matrix is a multiple of the identity. Thus, the co- 
ordinates are uncorrelated, but they are not inde- 
pendent except for the case of the normal model. 
We look (primarily) at estimators of the form
$X + \{\|U\|^2/(k+2)\}g(X)$.

The main result may be interpreted as follows:
If, when $X \sim N(\theta,\sigma^2 I)$ ($\sigma^2$ known), the estimator
$X + \sigma^2 g(X)$ dominates X, then, under the model
$(X,U) \sim f(\|x-\theta\|^2 + \|u\|^2)$, the estimator
$X + \{\|U\|^2/(k+2)\}g(X)$ dominates X. That is, substituting
the estimator $\|U\|^2/(k+2)$ for $\sigma^2$ preserves
domination uniformly for all parameters $(\theta,\sigma^2)$ and
(somewhat astonishingly) simultaneously for all distributions,
$f(\cdot)$. Note that, interestingly, $\|U\|^2/(k+2)$
is the minimum risk equivariant estimator of $\sigma^2$
in the normal case under the usual invariant loss.
This wonderful result is due to Cellier and Fourdrinier
(1995). We refer the reader to their paper for
the original proof based on Stokes' theorem applied
to the distribution of X conditional on
$\|X-\theta\|^2 + \|U\|^2 = R^2$. One interesting aspect of that proof is
that even if the original distribution has no density,
the conditional distribution of X does have a density
for all $k > 0$.

We will approach the above result from two dif- 
ferent directions. The first approach is essentially 
an extension of Lemma 3.4. As in that case, the re- 
sulting expression for the risk still involves both the 
data and $\theta$ inside the expectation, but the function
$Q(\|X-\theta\|^2 + \|U\|^2)$ is a common factor. This allows
the treatment of the remaining terms as if they are 
an unbiased estimate of the risk difference. 

The second approach is due to Fourdrinier, Straw- 
derman and Wells (2003), and is attractive because 
it is essentially statistical in nature, depending on 
completeness and sufficiency. It may be argued also 
that this approach is somewhat more general in that 
it may be useful even when the function g{x) is not 
necessarily weakly differentiable. In this case an un- 
biased estimator of the risk difference is obtained 
which agrees with that in Cellier and Fourdrinier 
(1995). This is in contrast to the above method whereby
the expression for the risk difference still has
a factor $Q(\|X-\theta\|^2 + \|U\|^2)$ inside the expectation.

Note. Technically, our use of the term "unknown
scale" is somewhat misleading in that the scale parameter
may, in fact, be known. We typically think
of $f(\cdot)$ as being a known density, which implies that
the scale is known as well. It may have been preferable
to write the density as
$(X,U) \sim \{1/\sigma^{p+k}\}f(\{\|x-\theta\|^2 + \|u\|^2\}/\sigma^2)$,
emphasizing the unknown scale parameter.
This is more in keeping with the usual
canonical form of the general linear model with spherically
symmetric errors. What is of fundamental importance
is the presence of the residual vector, U, in
allowing uniform domination over the estimator X
simultaneously for the entire class of spherically symmetric
distributions. Since the suppression of the
scale parameter makes notation a bit simpler, we
will, for the most part, use the above notation in
this section. Additionally, we continue to use the unnormalized
loss, $L(\theta,\delta) = \|\delta-\theta\|^2$, and state results
in terms of dominance over X instead of minimaxity,
since the minimax risk is infinite. In order to
speak meaningfully of minimaxity in the unknown
scale case, we should use a normalized version of the
loss, such as $L(\theta,\delta) = \|\delta-\theta\|^2/\sigma^2$.

4.1 A Generalization of Lemma 3.4 

Lemma 4.1. Suppose $(X,U) \sim f(\|x-\theta\|^2 + \|u\|^2)$,
where $\dim X = \dim\theta = p$, $\dim U = k$. Then, provided
$g(x,\|u\|^2)$ is weakly differentiable in each coordinate:

1. $E_\theta[\|U\|^2(X-\theta)'g(X,\|U\|^2)]
   = E_\theta[\|U\|^2\nabla_X'g(X,\|U\|^2)Q(\|X-\theta\|^2 + \|U\|^2)]$.

2. $E_\theta[\|U\|^4\|g(X,\|U\|^2)\|^2]
   = E_\theta[h(X,\|U\|^2)Q(\|X-\theta\|^2 + \|U\|^2)]$,
where $Q(t) = \{2f(t)\}^{-1}\int_t^\infty f(s)\,ds$ and

(4.1)  $$h(x,\|u\|^2) = (k+2)\|u\|^2\|g(x,\|u\|^2)\|^2
        + 2\|u\|^4\frac{\partial}{\partial\|u\|^2}\|g(x,\|u\|^2)\|^2.$$

Proof. The proof of part 1 is essentially the
same as the proof of Lemma 3.4, holding U fixed
throughout. The same is true of part 2, where the
roles of X and U are reversed and one notes that

$$\nabla_u'(\|u\|^2u) = (k+2)\|u\|^2,$$

$$\nabla_u'\{(\|u\|^2u)\|g(x,\|u\|^2)\|^2\} = h(x,\|u\|^2),$$

which is given by (4.1), and, hence,

$$\begin{aligned}
E_\theta[\|U\|^4\|g(X,\|U\|^2)\|^2]
 &= E_\theta[(\|U\|^2U)'U\|g(X,\|U\|^2)\|^2] \\
 &= E_\theta[\nabla_U'\{(\|U\|^2U)\|g(X,\|U\|^2)\|^2\}
    \,Q(\|X-\theta\|^2 + \|U\|^2)] \\
 &= E_\theta[h(X,\|U\|^2)Q(\|X-\theta\|^2 + \|U\|^2)]. \qquad □
\end{aligned}$$

One version of the main result for estimators of
the form $X + \{\|U\|^2/(k+2)\}g(X)$ is the following
theorem.

Theorem 4.1. Suppose $(X,U)$ is as in Lemma 4.1.
Then:

1. The risk of an estimator $X + \{\|U\|^2/(k+2)\}g(X)$
is given by

$$R(\theta, X + \{\|U\|^2/(k+2)\}g(X)) = E_\theta[\|X-\theta\|^2]
  + E_\theta\Big[\frac{\|U\|^2}{k+2}\{\|g(X)\|^2 + 2\nabla'g(X)\}
    Q(\|X-\theta\|^2 + \|U\|^2)\Big].$$

2. $X + \{\|U\|^2/(k+2)\}g(X)$ dominates X provided
$\|g(x)\|^2 + 2\nabla'g(x) \le 0$.

Proof. Note that

$$\begin{aligned}
R(\theta, X + \{\|U\|^2/(k+2)\}g(X))
 &= E_\theta[\|X-\theta\|^2]
  + E_\theta\Big[\frac{\|U\|^4}{(k+2)^2}\|g(X)\|^2
    + 2\frac{\|U\|^2}{k+2}(X-\theta)'g(X)\Big] \\
 &= E_\theta[\|X-\theta\|^2]
  + E_\theta\Big[\frac{\|U\|^2}{k+2}\{\|g(X)\|^2 + 2\nabla'g(X)\}
    Q(\|X-\theta\|^2 + \|U\|^2)\Big]
\end{aligned}$$

by successive application of parts 1 and 2 of Lemma 4.1. □

Example 4.1. Baranchik-type estimators: Suppose
the estimator is given by
$(1 - \|U\|^2r(\|X\|^2)/\{(k+2)\|X\|^2\})X$, where $r(t)$ is nondecreasing, and
$0 \le r(t) \le 2(p-2)$, then for $p \ge 3$ the estimator
dominates X simultaneously for all spherically symmetric
distributions for which the risk of X is finite.
This follows since, if $g(x) = -xr(\|x\|^2)/\|x\|^2$,
then

$$\begin{aligned}
\|g(x)\|^2 + 2\nabla'g(x)
 &= r^2(\|x\|^2)/\|x\|^2
    - 2\{(p-2)r(\|x\|^2)/\|x\|^2 + 2r'(\|x\|^2)\} \\
 &\le r^2(\|x\|^2)/\|x\|^2 - 2(p-2)r(\|x\|^2)/\|x\|^2 \le 0.
\end{aligned}$$



Example 4.2. James-Stein estimators: If
$r(\|x\|^2) \equiv a$, the Baranchik estimator is a James-Stein
estimator, and, since $r'(t) \equiv 0$, the risk is given
by

$$E_\theta[\|X-\theta\|^2] + \frac{a^2 - 2a(p-2)}{k+2}
  E_\theta\Big[\frac{\|U\|^2}{\|X\|^2}Q(\|X-\theta\|^2 + \|U\|^2)\Big].$$

Just as in the normal case, $a = p - 2$ is the uniformly
best choice to minimize the risk. But here it is the
uniformly best choice for every distribution. Hence,
the estimator $(1 - (p-2)\|U\|^2/\{(k+2)\|X\|^2\})X$
is uniformly best, simultaneously for all spherically
symmetric distributions among the class of James-Stein
estimators!
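
The robustness claim in Example 4.2 can be probed by simulation. The sketch below is a hedged illustration with assumed settings ($p = 6$, $k = 10$, $\theta = (0.5,\ldots,0.5)'$) and two assumed error laws built as scale mixtures $\sqrt{V}Z$ with $Z \sim N(0, I_{p+k})$: the normal ($V \equiv 1$) and a multivariate-t with 10 degrees of freedom ($V = 10/\chi^2_{10}$). The function name `residual_js_risks` is hypothetical. The same estimator, with $a = p - 2$ and no knowledge of the error law, is applied in both cases.

```python
import numpy as np

def residual_js_risks(theta, k, mixing, n_rep=100000, seed=4):
    """Monte Carlo risks of X and of (1 - (p-2)||U||^2 / ((k+2)||X||^2)) X."""
    rng = np.random.default_rng(seed)
    p = theta.size
    z = rng.standard_normal((n_rep, p + k))
    scale = np.sqrt(mixing(rng, n_rep))[:, None]   # sqrt(V), one draw per replicate
    x = theta + scale * z[:, :p]                   # observation
    u = scale * z[:, p:]                           # residual vector
    shrink = (p - 2) * np.sum(u**2, axis=1) / ((k + 2) * np.sum(x**2, axis=1))
    js = (1.0 - shrink)[:, None] * x

    def risk(est):
        return float(np.mean(np.sum((est - theta) ** 2, axis=1)))

    return risk(x), risk(js)

# Both mixing laws are assumptions chosen for this illustration.
mixings = {
    "normal": lambda rng, n: np.ones(n),
    "t_10": lambda rng, n: 10.0 / rng.chisquare(10, size=n),
}
results = {name: residual_js_risks(np.full(6, 0.5), k=10, mixing=m)
           for name, m in mixings.items()}
for name, (risk_x, risk_js) in results.items():
    print(f"{name:8s} risk(X) = {risk_x:.3f}   risk(JS) = {risk_js:.3f}")
```

In both cases the estimated risk of the shrinkage estimator should fall clearly below that of X, even though the estimator uses neither $\sigma^2$ nor the form of $f$.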

A more refined version of Theorem 4.1 which uses
the full power of Lemma 4.1 is proved in the same 
way. We give it for completeness and since it is useful 
in the study of risks of Bayes estimators. 

Theorem 4.2. Suppose $(X,U)$ is as in Lemma 4.1.
Then, under suitable smoothness conditions
on $g(\cdot)$:

1. The risk of an estimator $X + \{\|U\|^2/(k+2)\}g(X,\|U\|^2)$
is given by

$$\begin{aligned}
R(\theta, X + \{\|U\|^2/(k+2)\}g(X,\|U\|^2))
 = E_\theta[\|X-\theta\|^2]
 + E_\theta\Big[&(k+2)^{-1}\|U\|^2\Big\{\|g(X,\|U\|^2)\|^2
   + 2\nabla_X'g(X,\|U\|^2) \\
 &+ 2(k+2)^{-1}\|U\|^2\frac{\partial}{\partial\|U\|^2}\|g(X,\|U\|^2)\|^2\Big\}
   Q(\|X-\theta\|^2 + \|U\|^2)\Big],
\end{aligned}$$

2. $X + \{\|U\|^2/(k+2)\}g(X,\|U\|^2)$ dominates X provided

$$\|g(x,\|u\|^2)\|^2 + 2\nabla_x'g(x,\|u\|^2)
  + 2\frac{\|u\|^2}{k+2}\frac{\partial}{\partial\|u\|^2}\|g(x,\|u\|^2)\|^2 \le 0.$$

Corollary 4.1. Suppose
$\delta(X,\|U\|^2) = (1 - \|U\|^2r(\|X\|^2/\|U\|^2)/\|X\|^2)X$.
Then $\delta(X,\|U\|^2)$ dominates X provided:

1. $0 \le r(\cdot) \le 2(p-2)/(k+2)$ and

2. $r(\cdot)$ is nondecreasing.

The result follows from Theorem 4.2 by a straightforward
calculation.



4.2 A More Statistical Approach Involving 
Sufficiency and Completeness 

We largely follow Fourdrinier, Strawderman and 
Wells (2003) in this subsection. The nature of the 
conclusions for estimators is essentially as in Theo- 
rem 4.1, but the result is closer in spirit to the result 
of Cellier and Fourdrinier (1995) in that we obtain 
an unbiased estimator of risk difference (from X) 
instead of the expression in Theorem 4.1 where the 
function $Q(\cdot)$, which depends on $\theta$, intervenes. The
following lemma is the key to this development. 

Lemma 4.2. Let $(X,U) \sim f(\|x-\theta\|^2 + \|u\|^2)$,
where $\dim X = \dim\theta = p$ and $\dim U = k$. Suppose
$g(\cdot)$ and $h(\cdot)$ are such that when $X \sim N_p(\theta, I)$,
$E_\theta[(X-\theta)'g(X)] = E_\theta[h(X)]$. Then, for $(X,U)$ as
above,

$$E_\theta[\|U\|^2(X-\theta)'g(X)] = \{1/(k+2)\}E_\theta[\|U\|^4h(X)],$$

provided the expectations exist.

Note. Typically, of course, $h(x)$ is the divergence
of $g(x)$, and, in all cases known to us, this
remains essentially true. We choose this form of expressing
the lemma because in certain instances of
restricted parameter spaces the lemma applies even
though the function $g(\cdot)$ may not be weakly differentiable,
but the equality still holds for $g(x)I_A(g(x))$
and $h(x) = \nabla'g(x)I_A(g(x))$, where $I_A(\cdot)$ is the indicator
function of a set A.

Proof of Lemma 4.2. Suppose first, that the
distribution of $(X,U)$ is $N_{p+k}(\{\theta, 0\}, \sigma^2 I)$ and that $\theta$
is considered known. Then by the independence of X
and U we have by assumption that

$$\begin{aligned}
\sigma^2E_\theta[(X-\theta)'g(X)]
 &= E_\theta[(1/k)\|U\|^2(X-\theta)'g(X)] \\
 &= E_\theta[\{k(k+2)\}^{-1}\|U\|^4h(X)].
\end{aligned}$$

Hence, the claimed result of the lemma is true for
the normal case. Now use the fact that in the normal
case (for $\theta$ known), $\|X-\theta\|^2 + \|U\|^2$ is a complete
sufficient statistic. So it must be that

$$E_\theta[\|U\|^2(X-\theta)'g(X) \mid \|X-\theta\|^2 + \|U\|^2]
 = E_\theta\Big[\frac{\|U\|^4h(X)}{k+2} \,\Big|\, \|X-\theta\|^2 + \|U\|^2\Big]$$

for all $\|X-\theta\|^2 + \|U\|^2$ except on a set of measure
0, since each function of $\|X-\theta\|^2 + \|U\|^2$ has the
same expected value. Actually, it can be shown that
these conditional expectations are continuous in R
and, hence, they agree for all R (see Fourdrinier,
Strawderman and Wells, 2003).

But the distribution of $(X,U)$ conditional on
$\|X-\theta\|^2 + \|U\|^2 = R^2$ is uniform on the sphere centered
at $(\theta, 0)$ of radius R, which is the same as the
conditional distribution of $(X,U)$ conditional on
$\|X-\theta\|^2 + \|U\|^2 = R^2$ for any spherically symmetric
distribution. Hence, the equality which holds for the
normal distribution holds for all distributions $f(\cdot)$.
□
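
The identity in Lemma 4.2 can be checked by simulation for a distribution that is far from normal. The sketch below takes $(X - \theta, U)$ uniform on a sphere in $R^{p+k}$ and uses a smooth test function $g(x) = -x/(1 + \|x\|^2)$, whose divergence is $-\{p/(1+\|x\|^2) - 2\|x\|^2/(1+\|x\|^2)^2\}$. The choices $p = 4$, $k = 3$, the radius, $\theta$ and $g$ are all assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
p, k, radius, n_rep = 4, 3, np.sqrt(7.0), 200000
theta = np.array([1.0, 0.0, 0.0, 0.0])

# (X - theta, U) uniform on the sphere of the given radius in R^(p+k).
z = rng.standard_normal((n_rep, p + k))
w = radius * z / np.linalg.norm(z, axis=1, keepdims=True)
x, u = theta + w[:, :p], w[:, p:]

s = np.sum(x**2, axis=1)        # ||X||^2
u2 = np.sum(u**2, axis=1)       # ||U||^2

g = -x / (1.0 + s)[:, None]                              # assumed test function
div_g = -(p / (1.0 + s) - 2.0 * s / (1.0 + s) ** 2)      # h(x) = divergence of g

lhs = float(np.mean(u2 * np.sum((x - theta) * g, axis=1)))   # E[||U||^2 (X-theta)'g(X)]
rhs = float(np.mean(u2**2 * div_g) / (k + 2))                # E[||U||^4 h(X)] / (k+2)
print(lhs, rhs)
```

The two Monte Carlo averages should agree up to simulation error, even though the sampling distribution has no density on $R^{p+k}$.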

Lemma 4.2 immediately gives the following unbiased
estimator of risk difference and a condition
for dominating X for estimators of the form
$\delta(X) = X + \{\|U\|^2/(k+2)\}g(X)$.

Theorem 4.3. Suppose $(X,U)$, $g(x)$ and $h(x)$
are as in Lemma 4.2. Then, for the estimator
$\delta(X) = X + \{\|U\|^2/(k+2)\}g(X)$:

1. The risk difference is given by

$$R(\theta,\delta) - E_\theta[\|X-\theta\|^2]
 = E_\theta\Big[\frac{\|U\|^4}{(k+2)^2}\{\|g(X)\|^2 + 2\nabla'g(X)\}\Big].$$

2. $\delta(X)$ beats X provided $\|g(x)\|^2 + 2\nabla'g(x) \le 0$,
with strict inequality on a set of positive measure,
and provided all expectations are finite.

5. RESTRICTED PARAMETER SPACES 

We consider a simple version of the general restricted
parameter space problem which illustrates what
types of results can be obtained. Suppose $(X,U)$ is
distributed as in Theorem 4.1 but it is known that
$\theta_i \ge 0$, $i = 1,\ldots,p$, that is, $\theta \in R^p_+$, the first orthant.
What follows can be generalized to the case where $\theta$
is restricted to a polyhedral cone, and more generally
a smooth cone. The material in this section is
adapted from Fourdrinier, Strawderman and Wells
(2003).

In the normal case, the MLE of $\theta$ subject to the
restriction that $\theta \in R^p_+$ is $X_+$, where the ith component
is $X_i$ if $X_i \ge 0$ and 0 otherwise. Here, as in
the case of the more general restriction to a convex
cone, the MLE is the projection of X onto the restricted
cone. Chang (1982) considered domination
of the MLE of $\theta$ when X has a $N_p(\theta, I)$ distribution
and $\theta \in R^p_+$ via certain Stein-type shrinkage estimators.
Sengupta and Sen (1991) extended Chang's results
to Stein-type shrinkage estimators of the form
$\delta(X) = (1 - r_s(\|X_+\|^2)/\|X_+\|^2)X_+$, where $r_s(\cdot)$ is
nondecreasing, and $0 \le r_s(\cdot) \le 2(s-2)_+$, and where s
is the (random) number of positive components of X.
Hence, shrinkage occurs only when s, the number
of positive components of X, is at least 3 and the
amount of shrinkage is governed by the sum of squares
of the positive components. A similar result holds
if $\theta$ is restricted to a general polyhedral cone where
$X_+$ is replaced by the projection of X onto the
cone and s is defined to be the dimension of the face
onto which X is projected.
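
A minimal sketch of the first-orthant estimator just described is given below, for the normal case only. It assumes the constant choice $r_s \equiv (s-2)_+$, which is nondecreasing and lies in $[0, 2(s-2)_+]$; the function name `orthant_shrinkage` and the simulation settings ($p = 10$, $\theta_i = 0.3$) are hypothetical choices for illustration.

```python
import numpy as np

def orthant_shrinkage(x):
    """Row-wise (1 - r_s / ||X_+||^2) X_+ with the assumed choice r_s = (s - 2)_+."""
    x_plus = np.maximum(x, 0.0)                 # MLE: projection onto the first orthant
    s = np.count_nonzero(x_plus, axis=1)        # number of positive components
    sq = np.sum(x_plus**2, axis=1)              # sum of squares of positive components
    r = np.maximum(s - 2, 0).astype(float)      # zero unless s >= 3, so no shrinkage then
    factor = np.ones_like(sq)
    active = sq > 0
    factor[active] = 1.0 - r[active] / sq[active]
    return factor[:, None] * x_plus

rng = np.random.default_rng(6)
p = 10
theta = np.full(p, 0.3)                         # a point in the first orthant
x = theta + rng.standard_normal((20000, p))     # X ~ N_p(theta, I)

mle = np.maximum(x, 0.0)
shrunk = orthant_shrinkage(x)
risk_mle = float(np.mean(np.sum((mle - theta) ** 2, axis=1)))
risk_shrunk = float(np.mean(np.sum((shrunk - theta) ** 2, axis=1)))
print(risk_mle, risk_shrunk)
```

In this configuration the shrinkage version should show a noticeably smaller estimated risk than the MLE $X_+$.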

We choose the simple polyhedral cone $\theta \in R^p_+$ because
it will be reasonably clear that some version
of the Stein Lemma 3.1 applies in the normal case.
We first indicate a convenient, but complicated looking,
alternate representation of an estimator of the
above form in this case. Denote the $n = 2^p$ orthants
of $R^p$ by $O_1,\ldots,O_n$, and let $O_1$ be $R^p_+$. Then we
may rewrite (a slightly more general version of) the
above estimator as

$$\delta(X) = \sum_{i=1}^n\Big(1 - \frac{r_i(\|P_i(X)\|^2)}{\|P_i(X)\|^2}\Big)
              P_i(X)I_{O_i}(X),$$

where $P_i(X)$ is the linear projection of X onto $F_i$,
where $F_i$ is the s-dimensional face of $R^p_+ = O_1$ onto
which $O_i$ is projected. Note that if $r_i(\cdot) \equiv 0$, $\forall i$, the
estimator is just the MLE.

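Lemma 5.1. Suppose $X \sim N_p(\theta, I)$, and let each
$r_i(\cdot)$ be smooth and bounded. Then:

1. For each $O_i$,
$\{r_i(\|P_i(x)\|^2)/\|P_i(x)\|^2\}P_i(x)I_{O_i}(x)$ is
weakly differentiable in $x$.

2. Further,

$$E_\theta\left[\frac{r_i(\|P_i(X)\|^2)}{\|P_i(X)\|^2}\, P_i(X)^{T}(P_i(X) - \theta)\, I_{O_i}(X)\right] = E_\theta\left[\left\{\frac{(s-2)\, r_i(\|P_i(X)\|^2)}{\|P_i(X)\|^2} + 2 r_i'(\|P_i(X)\|^2)\right\} I_{O_i}(X)\right],$$

provided expectations exist.

3. $\delta(X) = \sum_{i=1}^{n}\{1 - r_i(\|P_i(X)\|^2)/\|P_i(X)\|^2\} P_i(X) I_{O_i}(X)$
as given above dominates the MLE $X_+$, provided $r_i$ is
nondecreasing and bounded between $0$ and $2(s-2)_+$.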

Proof. Weak differentiability in part 1 follows since the
function is smooth away from the boundary of $O_i$ and is
continuous on the boundary except at the origin. Part 2
follows from Stein's Lemma 3.1 and the fact that
(essentially) $P_i(X) \sim N_s(\theta, \sigma^2 I)$, since
$p - s$ of the coordinates are 0. Part 3 follows by Stein's
Lemma 3.1 as in Proposition 3.1 applied to each orthant. We
omit the details. The reader is referred to Sengupta and Sen
(1991) or Fourdrinier, Strawderman and Wells (2003) for
details in the more general case of a polyhedral cone. □
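The identity in part 2 of Lemma 5.1 can be checked by
simulation. The sketch below is only an illustration under
stated assumptions: it uses the particular smooth bounded
function $r(t) = t/(1+t)$ (so $r(t)/t = 1/(1+t)$ and
$r'(t) = 1/(1+t)^2$), describes an orthant by a hypothetical
boolean pattern `pos` marking which coordinates are positive
there, and takes $P_i(X)$ to be $X$ with the remaining
coordinates set to 0.

```python
import random


def stein_identity_check(theta, pos, n_draws=300_000, seed=0):
    """Monte Carlo estimates of both sides of Lemma 5.1, part 2.

    theta : mean vector of X ~ N_p(theta, I)
    pos   : booleans; pos[j] is True when coordinate j is positive
            in the chosen orthant O_i (these coordinates span F_i)
    Uses r(t) = t / (1 + t), an illustrative choice.
    """
    rng = random.Random(seed)
    s = sum(pos)                      # dimension of the face F_i
    lhs = rhs = 0.0
    for _ in range(n_draws):
        x = [rng.gauss(t, 1.0) for t in theta]
        # Indicator I_{O_i}(X): skip draws that fall outside the orthant.
        if any((xi > 0) != in_face for xi, in_face in zip(x, pos)):
            continue
        # ||P_i X||^2 and P_i(X)^T (P_i(X) - theta); coordinates off
        # the face contribute 0 because P_i(X) vanishes there.
        t2 = sum(xi * xi for xi, f in zip(x, pos) if f)
        cross = sum(xi * (xi - ti) for xi, ti, f in zip(x, theta, pos) if f)
        lhs += cross / (1.0 + t2)                              # r(t)/t * cross
        rhs += (s - 2) / (1.0 + t2) + 2.0 / (1.0 + t2) ** 2    # (s-2) r/t + 2 r'
    return lhs / n_draws, rhs / n_draws


if __name__ == "__main__":
    left, right = stein_identity_check([1.0, 1.0, 1.0, 0.5],
                                       [True, True, True, False])
    print(left, right)   # the two estimates should be close
```

The two averages agree only up to Monte Carlo error, so this
is a sanity check of the reconstructed identity, not a proof.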

Next, essentially applying Lemma 4.2 to each orthant and
using Lemma 5.1 we have the following generalization to the
case of a general spherically symmetric distribution.

Theorem 5.1. Let
$(X, U) \sim f(\|x - \theta\|^2 + \|u\|^2)$ where
$\dim X = \dim \theta = p$ and $\dim U = k$ and suppose that
$\theta \in \mathbb{R}^p_+$. Then

$$\delta(X, U) = \sum_{i=1}^{n}\left(1 - \frac{\|U\|^2\, r_i(\|P_i(X)\|^2)}{(k+2)\,\|P_i(X)\|^2}\right) P_i(X)\, I_{O_i}(X)$$

dominates $X_+$, provided $r_i$ is nondecreasing and
bounded between $0$ and $2(s-2)_+$.
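As a hedged illustration of Theorem 5.1, the following sketch
implements the estimator for the first orthant with the
constant choice $r_i \equiv (s-2)_+$ and compares simulated
risks with those of $X_+$ when $(X, U)$ is multivariate
Student $t$. The degrees of freedom `nu`, the sample sizes,
the function names and the $(k+2)$ normalization of
$\|U\|^2$ (taken to match Section 4) are all assumptions of
this example.

```python
import math
import random


def restricted_shrinkage(x, u):
    """delta(X, U) for the first orthant with r = (s-2)_+ (illustrative)."""
    k = len(u)
    x_plus = [max(xi, 0.0) for xi in x]
    s = sum(1 for xi in x if xi > 0)
    if s < 3:
        return x_plus                      # r must be 0: no shrinkage
    norm2 = sum(v * v for v in x_plus)     # ||P_i X||^2 = ||X_+||^2
    u2 = sum(v * v for v in u)             # ||U||^2
    factor = 1.0 - u2 * (s - 2) / ((k + 2) * norm2)
    return [factor * v for v in x_plus]


def simulate_risks(theta, k, nu=10.0, n_draws=20_000, seed=1):
    """Average squared-error loss of X_+ and of delta under a t model."""
    rng = random.Random(seed)
    loss_mle = loss_shrink = 0.0
    for _ in range(n_draws):
        # Multivariate t: a normal vector divided by sqrt(chi2_nu / nu).
        scale = math.sqrt(nu / rng.gammavariate(nu / 2.0, 2.0))
        x = [t + scale * rng.gauss(0.0, 1.0) for t in theta]
        u = [scale * rng.gauss(0.0, 1.0) for _ in range(k)]
        mle = [max(xi, 0.0) for xi in x]
        est = restricted_shrinkage(x, u)
        loss_mle += sum((m - t) ** 2 for m, t in zip(mle, theta))
        loss_shrink += sum((e - t) ** 2 for e, t in zip(est, theta))
    return loss_mle / n_draws, loss_shrink / n_draws


if __name__ == "__main__":
    print(simulate_risks([0.0] * 6, k=6))
```

At $\theta = 0$ the gap between the two simulated risks is
typically large; it narrows as $\theta$ moves away from the
boundary of the cone.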

6. BAYES ESTIMATION 

There have been advancements in Bayes estima- 
tion of location vectors in several directions in the 
past 15 years. Perhaps the most important advance- 
ments have come in the computational area, particu- 
larly Markov chain Monte Carlo (MCMC) methods. 
We do not cover these developments in this review. 

Admissibility and inadmissibility of (generalized) Bayes
estimators in the normal case with known scale parameter
were considered in Berger and Strawderman (1996) and in
Berger, Strawderman and Tang (2005), where Brown's (1971)
condition for admissibility (and inadmissibility) was
applied for a variety of hierarchical Bayes models.
Maruyama and Takemura (2008) also give admissibility
results for the general spherically symmetric case. At
least for spherically symmetric priors, the conditions are,
essentially, that priors with tails no greater than
$O(\|\theta\|^{-(p-2)})$ give admissible procedures.

Fourdrinier, Strawderman and Wells (1998), using Stein's
(1981) results (especially Proposition 3.1 above, and its
corollaries), give classes of minimax Bayes (and generalized
Bayes) estimators which include scaled multivariate-$t$
priors under certain conditions. Berger and Robert (1990)
give classes of priors leading to minimax estimators.
Kubokawa and Strawderman (2007) give classes of priors in
the setup of Berger and Strawderman (1996) that lead to
admissible minimax estimators. Maruyama (2003a) and
Fourdrinier, Kortbi and Strawderman (2008), in the scale
mixture of normal case, find Bayes and generalized Bayes
minimax estimators, generalizing results of Strawderman
(1974). As mentioned in Section 3, these results use either
Berger's (1975) result (a version of which is given in
Theorem 3.2) or Strawderman's (1974) result for mixtures of
normal distributions. Fourdrinier and Strawderman (2008)
proved minimaxity of generalized Bayes estimators
corresponding to certain harmonic priors for classes of
spherically symmetric sampling distributions which are not
necessarily mixtures of normals. The results in this paper
are not based directly on the discussion of Section 3 but
are somewhat more closely related in spirit to the approach
of Stein (1981).

We give below an intriguing result of Maruyama (2003b) for
the unknown scale case (see also Maruyama and Strawderman,
2005), which is related to the (distributional) robustness
of Stein estimators in the unknown scale case treated in
Section 4. First, we give a lemma which will aid in the
development of the main result.

Lemma 6.1. Suppose
$(X, U) \sim \eta^{(p+k)/2} f(\eta\{\|x - \theta\|^2 + \|u\|^2\})$,
the (location-scale invariant) loss is given by
$L((\theta, \eta), \delta) = \eta\|\delta - \theta\|^2$ and
the prior distribution on $(\theta, \eta)$ is of the form
$\pi(\theta, \eta) = \rho(\theta)\eta^{B}$. Then provided
all integrals exist, the generalized Bayes estimator does
not depend on $f(\cdot)$.

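Proof.

$$\begin{aligned}
\delta(X, U) &= E[\theta\eta \mid X, U]\,/\,E[\eta \mid X, U] \\
&= \int_{\mathbb{R}^p}\int_0^\infty \theta\, \eta^{(p+k)/2+B+1} f(\eta\{\|X - \theta\|^2 + \|U\|^2\})\, \rho(\theta)\, d\eta\, d\theta \\
&\quad \times \left[\int_{\mathbb{R}^p}\int_0^\infty \eta^{(p+k)/2+B+1} f(\eta\{\|X - \theta\|^2 + \|U\|^2\})\, \rho(\theta)\, d\eta\, d\theta\right]^{-1}.
\end{aligned}$$

Making the change of variables
$w = \eta(\|X - \theta\|^2 + \|U\|^2)$, we have

$$\begin{aligned}
\delta(X, U) &= \int_{\mathbb{R}^p} \theta\, (\|X - \theta\|^2 + \|U\|^2)^{-((p+k)/2+B+2)} \rho(\theta)\, d\theta \int_0^\infty w^{(p+k)/2+B+1} f(w)\, dw \\
&\quad \times \left[\int_{\mathbb{R}^p} (\|X - \theta\|^2 + \|U\|^2)^{-((p+k)/2+B+2)} \rho(\theta)\, d\theta \int_0^\infty w^{(p+k)/2+B+1} f(w)\, dw\right]^{-1} \\
&= \frac{\int_{\mathbb{R}^p} \theta\, (\|X - \theta\|^2 + \|U\|^2)^{-((p+k)/2+B+2)} \rho(\theta)\, d\theta}{\int_{\mathbb{R}^p} (\|X - \theta\|^2 + \|U\|^2)^{-((p+k)/2+B+2)} \rho(\theta)\, d\theta}.
\end{aligned}$$

□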
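The cancellation in this proof can be seen numerically. The
sketch below is a one-dimensional illustration under
assumptions of our own choosing: $p = 1$, $k = 2$, $B = 0$,
$\rho(\theta) = 1/(1 + \theta^2)$, and two densities
$f$ (a normal-type and a Student-type one with 12 degrees of
freedom). It computes the generalized Bayes estimator by
nested trapezoidal quadrature for each $f$ and compares it
with the $f$-free ratio obtained at the end of the proof.

```python
import math


def trapezoid(vals, h):
    """Composite trapezoidal rule on an equally spaced grid."""
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


P, K, B = 1, 2, 0.0                       # illustrative dimensions and prior power
C = (P + K) / 2.0 + B + 1.0               # exponent of eta in the integrand

S_GRID = [-14.0 + 28.0 * i / 300 for i in range(301)]    # eta = exp(s)
TH_GRID = [-30.0 + 60.0 * j / 800 for j in range(801)]   # theta grid
HS, HT = 28.0 / 300, 60.0 / 800


def rho(theta):
    return 1.0 / (1.0 + theta * theta)


def bayes_estimate(x, u2, f):
    """E[theta*eta | x, u] / E[eta | x, u] by nested quadrature (p = 1)."""
    num, den = [], []
    for th in TH_GRID:
        a_val = (x - th) ** 2 + u2
        # Integral over eta of eta^C f(eta * A), with eta = exp(s).
        inner = trapezoid(
            [math.exp((C + 1.0) * s) * f(math.exp(s) * a_val) for s in S_GRID], HS)
        weight = inner * rho(th)
        num.append(th * weight)
        den.append(weight)
    return trapezoid(num, HT) / trapezoid(den, HT)


def reduced_estimate(x, u2):
    """The f-free ratio from the last display of the proof."""
    w = [((x - th) ** 2 + u2) ** (-(C + 1.0)) * rho(th) for th in TH_GRID]
    return (trapezoid([th * wi for th, wi in zip(TH_GRID, w)], HT)
            / trapezoid(w, HT))


def f_normal(t):
    return math.exp(-t / 2.0)


def f_student(t):
    return (1.0 + t / 12.0) ** (-7.5)     # nu = 12, exponent (nu + p + k) / 2


if __name__ == "__main__":
    x, u2 = 1.5, 2.0
    print(bayes_estimate(x, u2, f_normal),
          bayes_estimate(x, u2, f_student),
          reduced_estimate(x, u2))
```

All three numbers should agree to within quadrature error,
which is the content of Lemma 6.1 in this special case.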



Hence, for (generalized) priors of the above form, the
Bayes estimator is independent of the sampling distribution
provided the Bayes estimator exists; thus, it may be
calculated for the most convenient density, which is
typically the normal. Our next lemma calculates the
generalized Bayes estimator for a normal sampling density
and for a class of priors for which $\rho(\cdot)$ is a scale
mixture of normals.

Lemma 6.2. Suppose the distribution of $(X, U)$ is normal
with variance $\sigma^2 = 1/\eta$. Suppose also that the
conditional distribution of $\theta$ given $\eta$ and
$\lambda$ is normal with mean $0$ and covariance
$\{(1 - \lambda)/(\eta\lambda)\}I$, and the density of
$(\eta, \lambda)$ is proportional to
$\eta^{b/2 - p/2 + a}\lambda^{b/2 - p/2 - 1}(1 - \lambda)^{-b/2 + p/2 - 1}$,
where $0 < \lambda < 1$.

1. Then the Bayes estimator is given by
$(1 - r(W)/W)X$, where $W = \|X\|^2/\|U\|^2$ and $r(w)$ is
given by

$$r(w) = w \int_0^1 \lambda^{b/2}(1 - \lambda)^{p/2 - b/2 - 1}(1 + w\lambda)^{-k/2 - a - b/2 - 2}\, d\lambda \times \left[\int_0^1 \lambda^{b/2 - 1}(1 - \lambda)^{p/2 - b/2 - 1}(1 + w\lambda)^{-k/2 - a - b/2 - 2}\, d\lambda\right]^{-1}. \tag{6.1}$$

This is well defined for $0 < b < p$, and
$k/2 + a + b/2 + 2 > 0$.

2. Furthermore, this estimator is generalized Bayes
corresponding to the generalized prior proportional to
$\eta^{a}\|\theta\|^{-b}$, for any spherically symmetric
density $f(\cdot)$ for which
$\int_0^\infty t^{(k+p)/2 + a + 1} f(t)\, dt < \infty$.

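Proof. Part 1. In the normal case,

$$\delta(X, U) = X + \frac{E[\eta(\theta - X) \mid X, U]}{E[\eta \mid X, U]} = X - \frac{\nabla_X m(X, U)}{2(\partial/\partial\|U\|^2)\, m(X, U)},$$

where the marginal $m(x, u)$ is proportional to

$$\begin{aligned}
&\int_0^1 \int_0^\infty \int_{\mathbb{R}^p} \eta^{b/2 + k/2 + p/2 + a} \lambda^{b/2 - 1}(1 - \lambda)^{-b/2 - 1} \exp(-\eta\{\|x - \theta\|^2 + \|u\|^2\}/2) \exp\left(-\frac{\eta\lambda}{2(1 - \lambda)}\|\theta\|^2\right) d\theta\, d\eta\, d\lambda \\
&\quad = K' \int_0^1 \int_0^\infty \eta^{b/2 + k/2 + a} \lambda^{b/2 - 1}(1 - \lambda)^{p/2 - b/2 - 1} \exp(-\eta\{\lambda\|x\|^2 + \|u\|^2\}/2)\, d\eta\, d\lambda \\
&\quad = K \int_0^1 (\lambda\|x\|^2 + \|u\|^2)^{-b/2 - k/2 - a - 1} \lambda^{b/2 - 1}(1 - \lambda)^{p/2 - b/2 - 1}\, d\lambda.
\end{aligned}$$

Hence, we may express the Bayes estimator as
$\delta(X, U) = X - g(X, U)$, where

$$\begin{aligned}
g(x, u) &= \frac{\nabla_x \int_0^1 (\lambda\|x\|^2 + \|u\|^2)^{-b/2 - k/2 - a - 1} \lambda^{b/2 - 1}(1 - \lambda)^{p/2 - b/2 - 1}\, d\lambda}{2(\partial/\partial\|u\|^2) \int_0^1 (\lambda\|x\|^2 + \|u\|^2)^{-b/2 - k/2 - a - 1} \lambda^{b/2 - 1}(1 - \lambda)^{p/2 - b/2 - 1}\, d\lambda} \\
&= x\, \frac{\int_0^1 (\lambda\|x\|^2 + \|u\|^2)^{-b/2 - k/2 - a - 2} \lambda^{b/2}(1 - \lambda)^{p/2 - b/2 - 1}\, d\lambda}{\int_0^1 (\lambda\|x\|^2 + \|u\|^2)^{-b/2 - k/2 - a - 2} \lambda^{b/2 - 1}(1 - \lambda)^{p/2 - b/2 - 1}\, d\lambda} \\
&= x\, \frac{\int_0^1 (\lambda w + 1)^{-b/2 - k/2 - a - 2} \lambda^{b/2}(1 - \lambda)^{p/2 - b/2 - 1}\, d\lambda}{\int_0^1 (\lambda w + 1)^{-b/2 - k/2 - a - 2} \lambda^{b/2 - 1}(1 - \lambda)^{p/2 - b/2 - 1}\, d\lambda} \\
&= \frac{r(w)}{w}\, x.
\end{aligned}$$

Part 2. A straightforward calculation shows that the
unconditional density of $(\theta, \eta)$ is proportional to
$\eta^{a}\|\theta\|^{-b}$. Hence, part 2 follows from
Lemma 6.1. □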
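For readers who want to evaluate the estimator of Lemma 6.2,
here is a minimal numerical sketch of (6.1) using a midpoint
rule. It is an illustration only: the function names and the
grid size `n` are our own choices, and the simple quadrature
is reliable mainly when $2 \leq b \leq p - 2$ (so that the
integrands are bounded) and $w$ is moderate.

```python
def shrink_ratio(w, p, k, a, b, n=4000):
    """r(w)/w from (6.1): a ratio of two integrals over (0, 1).

    Midpoint rule, so the endpoints 0 and 1 are never evaluated.
    """
    e = k / 2.0 + a + b / 2.0 + 2.0          # exponent of (1 + w*lambda)
    q = p / 2.0 - b / 2.0 - 1.0              # exponent of (1 - lambda)
    h = 1.0 / n
    num = den = 0.0
    for i in range(n):
        lam = (i + 0.5) * h
        common = (1.0 - lam) ** q * (1.0 + w * lam) ** (-e)
        num += lam ** (b / 2.0) * common
        den += lam ** (b / 2.0 - 1.0) * common
    return num / den


def r_of_w(w, p, k, a, b, n=4000):
    """r(w) as defined in (6.1)."""
    return w * shrink_ratio(w, p, k, a, b, n)


def generalized_bayes_estimate(x, u, a, b):
    """(1 - r(W)/W) X with W = ||X||^2 / ||U||^2."""
    p, k = len(x), len(u)
    w = sum(v * v for v in x) / sum(v * v for v in u)
    factor = 1.0 - shrink_ratio(w, p, k, a, b)
    return [factor * v for v in x]


if __name__ == "__main__":
    print(r_of_w(1.0, p=8, k=5, a=0.0, b=4.0))
    print(generalized_bayes_estimate([1.0] * 8, [1.0] * 5, a=0.0, b=4.0))
```

Since $r(w)/w$ is a ratio of integrals of $\lambda^{b/2}$
and $\lambda^{b/2-1}$ against the same positive weight, it
lies in $(0, 1)$, so the estimator always shrinks $X$ toward
the origin without changing sign.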

The following lemma gives properties of $r(w)$.

Lemma 6.3. Suppose $0 < b \leq p - 2$ and that
$k/2 + a + 1 > 0$. Then, (1) $r(w)$ is nondecreasing, and
(2) $0 \leq r(w) \leq b/(k + 2a + 2)$.

Proof. By a change of variables, letting $v = \lambda w$
in (6.1), then

$$r(w) = \int_0^w (v + 1)^{-b/2 - k/2 - a - 2}\, v^{b/2}(1 - v/w)^{p/2 - b/2 - 1}\, dv \times \left[\int_0^w (v + 1)^{-b/2 - k/2 - a - 2}\, v^{b/2 - 1}(1 - v/w)^{p/2 - b/2 - 1}\, dv\right]^{-1}.$$

So, we may rewrite $r(w)$ as $E_w[v]$, where $v$ has
density proportional to
$(1 + v)^{-b/2 - k/2 - a - 2}\, v^{b/2 - 1}(1 - v/w)^{p/2 - b/2 - 1} I_{[0, w]}(v)$.
This density has increasing monotone likelihood ratio in
$w$ as long as $p/2 - b/2 - 1 \geq 0$. Hence, part 1
follows.

The conditions of the lemma allow interchange of limit and
integration in both numerator and denominator of $r(w)$ as
$w \to \infty$. Hence,

$$\begin{aligned}
r(w) &\leq \frac{\int_0^\infty (1 + v)^{-b/2 - k/2 - a - 2}\, v^{b/2}\, dv}{\int_0^\infty (1 + v)^{-b/2 - k/2 - a - 2}\, v^{b/2 - 1}\, dv} \\
&= \frac{\int_0^1 u^{b/2}(1 - u)^{k/2 + a}\, du}{\int_0^1 u^{b/2 - 1}(1 - u)^{k/2 + a + 1}\, du} \quad [\text{letting } u = v/(v + 1)] \\
&= \frac{\mathrm{Beta}(b/2 + 1,\, k/2 + a + 1)}{\mathrm{Beta}(b/2,\, k/2 + a + 2)} \\
&= \frac{b/2}{k/2 + a + 1}.
\end{aligned}$$

□
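Both conclusions of Lemma 6.3 are easy to probe numerically.
The sketch below applies the two substitutions used in the
proof in succession ($v = \lambda w$, then $u = v/(v+1)$),
which maps the integrals to the bounded interval
$(0, w/(1+w))$. That combined form is our own algebra rather
than a formula stated above, so treat it as an assumption to
be checked; the parameter values are illustrative.

```python
def r_via_proof_substitutions(w, p, k, a, b, n=20_000):
    """r(w) after substituting v = lambda*w and then u = v/(v+1).

    Numerator integrand:   u^(b/2)   (1-u)^(k/2+a)   (1 - v/w)^q
    Denominator integrand: u^(b/2-1) (1-u)^(k/2+a+1) (1 - v/w)^q
    with v = u/(1-u), q = p/2 - b/2 - 1, integrated over (0, w/(1+w)).
    """
    q = p / 2.0 - b / 2.0 - 1.0
    upper = w / (1.0 + w)
    h = upper / n
    num = den = 0.0
    for i in range(n):
        u = (i + 0.5) * h                       # midpoint rule
        v = u / (1.0 - u)
        tail = max(0.0, 1.0 - v / w) ** q
        num += u ** (b / 2.0) * (1.0 - u) ** (k / 2.0 + a) * tail
        den += u ** (b / 2.0 - 1.0) * (1.0 - u) ** (k / 2.0 + a + 1.0) * tail
    return num / den


if __name__ == "__main__":
    p, k, a, b = 8, 5, 0.0, 4.0
    bound = b / (k + 2 * a + 2)                 # upper bound from part (2)
    for w in [0.01, 0.1, 1.0, 10.0, 100.0, 1e3, 1e4]:
        print(w, r_via_proof_substitutions(w, p, k, a, b), bound)
```

For these parameter values the printed values increase with
$w$ and approach, without exceeding, $b/(k + 2a + 2) = 4/7$.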

Combining Lemmas 6.1-6.3 with Corollary 4.1 gives as the
main result a class of estimators which are generalized
Bayes and minimax simultaneously for the entire class of
spherically symmetric sampling distributions (subject to
integrability conditions).

Theorem 6.1. Suppose that the distribution of $(X, U)$ and
the loss function are as in Lemma 6.1, and that the prior
distribution is as in Lemmas 6.2 and 6.3 with $a$
satisfying $b/(k + 2a + 2) \leq 2(p - 2)/(k + 2)$, and with
$0 < b \leq p - 2$. Then the corresponding generalized
Bayes estimator is minimax for all densities $f(\cdot)$
such that the $2(a + 2)$th moment of the distribution of
$(X, U)$ is finite, that is, $E(R^{2a + 4}) < \infty$.

We note that the above finiteness condition,
$E(R^{2a + 4}) < \infty$, is equivalent to the finiteness
condition,
$\int_0^\infty t^{(k+p)/2 + a + 1} f(t)\, dt < \infty$, in
Lemma 6.2.
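The parameter conditions of Theorem 6.1 are simple to check
mechanically. The helper below is a hypothetical convenience
function (its name and return format are ours); it tests only
the inequalities on $(p, k, a, b)$ and says nothing about the
moment condition, which depends on $f(\cdot)$.

```python
def minimax_conditions(p, k, a, b):
    """Check the parameter inequalities of Theorem 6.1 and Lemma 6.3.

    Returns (all_hold, details), where details maps each condition
    to True or False. The moment condition on f is NOT checked.
    """
    details = {
        "0 < b <= p - 2": 0 < b <= p - 2,
        "k/2 + a + 1 > 0": k / 2.0 + a + 1.0 > 0,
    }
    if details["k/2 + a + 1 > 0"]:
        # Safe to divide: k + 2a + 2 = 2 (k/2 + a + 1) > 0.
        details["b/(k+2a+2) <= 2(p-2)/(k+2)"] = (
            b / (k + 2.0 * a + 2.0) <= 2.0 * (p - 2) / (k + 2.0))
    else:
        details["b/(k+2a+2) <= 2(p-2)/(k+2)"] = False
    return all(details.values()), details


if __name__ == "__main__":
    print(minimax_conditions(p=8, k=5, a=0.0, b=4.0))
    print(minimax_conditions(p=8, k=5, a=-2.4, b=6.0))
```

In the first call $b/(k + 2a + 2) = 4/7$ is below
$2(p-2)/(k+2) = 12/7$, so the conditions hold; in the second,
$a$ is too negative and the bound fails.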

7. CONCLUDING REMARKS 

This paper has reviewed some of the developments 
in shrinkage estimation of mean vectors for spheri- 
cally symmetric distributions, mainly since the re- 
view paper of Brandwein and Strawderman (1990). 
Other papers in this volume review other aspects of 
the enormous literature generated by or associated 
with Stein's stunning inadmissibility result of 1956. 

Most of the developments we have covered are, or 
can be viewed as, outgrowths of Stein's papers of 
1973 and 1981, and, in particular, of Stein's lemma 
which gives (an incredibly useful) alternative expres- 
sion for the cross product term in the quadratic risk 
function. 

Among the topics which we have not covered is the closely
related literature for elliptically symmetric distributions
(see, e.g., Kubokawa and Srivastava, 2001, and Fourdrinier,
Strawderman and Wells, 2003, and the references therein).
We also have not included a discussion of Hartigan's (2004)
beautiful result that the (generalized or proper) Bayes
estimator of a normal mean vector with respect to the
uniform prior on any convex set in $\mathbb{R}^p$ dominates
$X$ for squared error loss. Nor have we discussed the very
useful and pretty development of the Kubokawa (1994) IERD
method for finding improved estimators, and, in particular,
for dominating James-Stein estimators (see also Marchand
and Strawderman, 2004, for some discussion of these last
two topics). We nonetheless hope we have provided some
intuition for, and given a flavor of, the developments and
rich literature in the area of improved estimators for
spherically symmetric distributions.

The impact of Stein's beautiful 1956 result and his
innovative development of the techniques in the 1973 and
1981 papers have inspired many researchers, fueled an
enormous literature on the subject, led to a deeper
understanding of theoretical and practical aspects of
"sharing strength" across related studies, and greatly
enriched the field of Statistics. Even some of the early
(and later) heated discussions of the theoretical and
practical aspects of "sharing strength" across unrelated
studies have had an ultimately positive impact on the
development of hierarchical models and computational tools
for their analysis. We are very pleased to have been asked
to contribute to this volume commemorating fifty years of
development of one of the most profound results in the
Statistical literature in the last half of the 20th century.